Final Project: Neural Radiance Field!

Part 1: Fit a Neural Field to a 2D Image

The first task was to implement a Multilayer Perceptron (MLP) network with Sinusoidal Positional Encoding (PE), following the architecture in the figure below.
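As an illustration of the positional encoding step, here is a minimal NumPy sketch. It assumes the common formulation in which the raw coordinate is kept and sine/cosine terms at frequencies 2^0·π up to 2^(L-1)·π are appended; the function name `positional_encoding` and the choice to keep the raw input are assumptions, not details confirmed by the project figure.

```python
import numpy as np

def positional_encoding(x, L):
    """Map x of shape (N, D) to shape (N, D * (2 * L + 1)).

    Assumed layout: [x, sin(2^0 pi x), cos(2^0 pi x), ..., sin(2^(L-1) pi x), cos(2^(L-1) pi x)].
    """
    x = np.asarray(x, dtype=np.float64)
    feats = [x]  # keep the raw coordinates (assumption)
    for i in range(L):
        freq = (2.0 ** i) * np.pi
        feats.append(np.sin(freq * x))
        feats.append(np.cos(freq * x))
    return np.concatenate(feats, axis=-1)

# Two normalized 2D pixel coordinates, encoded with L = 10
coords = np.array([[0.0, 0.0], [0.5, 0.25]])
encoded = positional_encoding(coords, L=10)
print(encoded.shape)  # (2, 42)
```

With 2D inputs and L = 10 this gives 2 · (2 · 10 + 1) = 42 features per pixel, which would be the input width of the MLP under these assumptions.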

The network has four linear layers, each with 256 channels. The output is three-dimensional, with the channels R, G and B, which together make up the image.
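The forward pass of such a network can be sketched in plain NumPy. This is only a sketch under stated assumptions: the 256 channels are taken to be the hidden width, the fourth layer is assumed to project down to the three color channels, and ReLU activations with a final sigmoid are assumed because the architecture figure is not reproduced here. The names `init_mlp` and `mlp_forward` are hypothetical.

```python
import numpy as np

def init_mlp(in_dim, hidden=256, out_dim=3, n_layers=4, seed=0):
    """Create weights for n_layers linear layers: in_dim -> hidden -> ... -> out_dim."""
    rng = np.random.default_rng(seed)
    dims = [in_dim] + [hidden] * (n_layers - 1) + [out_dim]
    params = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / d_in), size=(d_in, d_out))
        b = np.zeros(d_out)
        params.append((W, b))
    return params

def mlp_forward(params, x):
    """ReLU after every layer except the last; sigmoid keeps RGB in [0, 1] (assumed)."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    logits = h @ W + b
    return 1.0 / (1.0 + np.exp(-logits))

# in_dim = 42 corresponds to 2D coordinates encoded with L = 10 (raw input kept)
params = init_mlp(in_dim=42)
rgb = mlp_forward(params, np.zeros((5, 42)))
print(rgb.shape)  # (5, 3)
```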

The network was first trained on the "Fox image", which is displayed below.

Training with the following hyperparameters: L = 10, learning rate = 0.01, batch size = 1000 and iterations = 3000 yielded the results which are displayed below.
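To make the batch size concrete: one plausible reading is that each iteration draws 1000 random pixels from the image and regresses their colors from their normalized coordinates. The sketch below shows that sampling step only; the helper name `sample_pixel_batch` and the normalization of coordinates and colors to [0, 1] are assumptions.

```python
import numpy as np

def sample_pixel_batch(image, batch_size, rng):
    """image: (H, W, 3) uint8 -> coords (B, 2) in [0, 1), colors (B, 3) in [0, 1]."""
    H, W = image.shape[:2]
    idx = rng.integers(0, H * W, size=batch_size)  # flat pixel indices
    ys, xs = np.divmod(idx, W)                     # row and column of each pixel
    coords = np.stack([xs / W, ys / H], axis=-1)   # (x, y) order, normalized
    colors = image[ys, xs].astype(np.float64) / 255.0
    return coords, colors

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)  # stand-in for the real image
coords, colors = sample_pixel_batch(image, batch_size=1000, rng=rng)
print(coords.shape, colors.shape)  # (1000, 2) (1000, 3)
```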

Training with these hyperparameters: L = 15, learning rate = 0.001, batch size = 1000 and iterations = 1000 yielded the results displayed below.

The network was then trained on an image of Tiger Woods, which is displayed below.

Training with the following hyperparameters: L = 10, learning rate = 0.01, batch size = 1000 and iterations = 3000 yielded the results which are displayed below.

Training with these hyperparameters: L = 15, learning rate = 0.001, batch size = 1000 and iterations = 1000 yielded the results displayed below.

Part 2: Fit a Neural Radiance Field from Multi-view Images

In this part, camera and pixel data were converted into 3D rays for neural rendering. Pixels are transformed to camera space using the intrinsic matrix, and points are transformed from camera space to world space using the extrinsic matrices. Each ray has its origin at the camera center and a direction obtained by normalizing the vector from that origin to a point in world space. Points are then sampled along each ray, with small random shifts added during training so that the network is not always queried at the same fixed locations. A dataloader combines the multi-view image data and produces ray origins, ray directions, and pixel colors. Code was provided to verify the implementation; the result of this check is displayed below.
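The pipeline described above can be sketched as follows. This is a minimal sketch rather than the project's actual code: it assumes a 3x3 intrinsic matrix `K`, a 4x4 camera-to-world matrix `c2w`, and evenly spaced samples between hypothetical `near` and `far` bounds, jittered by up to one bin width. All function names and the example values are illustrative.

```python
import numpy as np

def pixel_to_camera(K, uv, s=1.0):
    """Back-project pixel coordinates uv (N, 2) to camera-space points at depth s."""
    uv1 = np.concatenate([uv, np.ones((uv.shape[0], 1))], axis=-1)  # homogeneous (N, 3)
    return s * (np.linalg.inv(K) @ uv1.T).T

def camera_to_world(c2w, x_c):
    """Apply a 4x4 camera-to-world transform to points x_c of shape (N, 3)."""
    x_c1 = np.concatenate([x_c, np.ones((x_c.shape[0], 1))], axis=-1)
    return (c2w @ x_c1.T).T[:, :3]

def pixel_to_ray(K, c2w, uv):
    """Return ray origins (N, 3) and unit directions (N, 3) for pixels uv."""
    origin = c2w[:3, 3]  # camera center in world space
    x_w = camera_to_world(c2w, pixel_to_camera(K, uv, s=1.0))
    d = x_w - origin
    d = d / np.linalg.norm(d, axis=-1, keepdims=True)
    origins = np.broadcast_to(origin, d.shape).copy()
    return origins, d

def sample_along_rays(origins, dirs, near, far, n_samples, perturb=True, rng=None):
    """Sample points o + t * d; with perturb=True each t is shifted by U(0, bin width)."""
    t = np.linspace(near, far, n_samples)
    t = np.broadcast_to(t, (origins.shape[0], n_samples)).copy()
    if perturb:
        rng = np.random.default_rng() if rng is None else rng
        width = (far - near) / n_samples
        t = t + rng.uniform(0.0, width, size=t.shape)
    pts = origins[:, None, :] + dirs[:, None, :] * t[..., None]
    return pts, t

# Illustrative camera: focal length 100, principal point (50, 50), identity pose
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
c2w = np.eye(4)
uv = np.array([[50.5, 50.5], [0.5, 0.5]])  # pixel centers (the +0.5 offset is an assumption)
o, d = pixel_to_ray(K, c2w, uv)
pts, t = sample_along_rays(o, d, near=2.0, far=6.0, n_samples=8,
                           rng=np.random.default_rng(0))
print(pts.shape)  # (2, 8, 3)
```

At evaluation time one would typically call `sample_along_rays` with `perturb=False` so that renders are deterministic.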

With the 3D samples in place, the neural radiance field (NeRF) was implemented in accordance with the architecture presented below.

Lastly, the volume rendering equation, presented below, was implemented.
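For reference, the standard discrete form of volume rendering computes each ray's color as a weighted sum of sample colors, where sample i gets weight T_i · (1 − exp(−σ_i · δ)) and T_i = exp(−Σ_{j<i} σ_j · δ) is the transmittance accumulated before it. The NumPy sketch below implements that form, assuming a constant step size δ along the ray; the function name `volrend` and its argument shapes are assumptions.

```python
import numpy as np

def volrend(sigmas, rgbs, step_size):
    """sigmas: (N, S, 1) densities, rgbs: (N, S, 3) colors -> rendered colors (N, 3)."""
    alpha = 1.0 - np.exp(-sigmas * step_size)  # opacity of each sample
    # Exclusive cumulative sum: density accumulated strictly before sample i
    accum = np.cumsum(sigmas * step_size, axis=1)
    accum = np.concatenate([np.zeros_like(accum[:, :1]), accum[:, :-1]], axis=1)
    transmittance = np.exp(-accum)             # T_i
    weights = transmittance * alpha            # (N, S, 1)
    return np.sum(weights * rgbs, axis=1)

rng = np.random.default_rng(0)
sigmas = rng.uniform(0.0, 2.0, size=(4, 16, 1))
rgbs = rng.uniform(0.0, 1.0, size=(4, 16, 3))
colors = volrend(sigmas, rgbs, step_size=0.25)
print(colors.shape)  # (4, 3)
```

Because the weights along a ray sum to at most 1, the rendered color stays in [0, 1] whenever the sample colors do.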

The results from training are presented below.

Bells & Whistles

For bells & whistles, depth rendering was implemented; the results are shown below.
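One common way to render depth, shown here as a sketch and not necessarily the method used for the results below, is to reuse the volume rendering weights and replace each sample's color with its distance along the ray, giving an expected depth per ray. The function name `render_depth` and the constant step size are assumptions.

```python
import numpy as np

def render_depth(sigmas, t_vals, step_size):
    """sigmas: (N, S) densities, t_vals: (N, S) sample distances -> expected depth (N,)."""
    alpha = 1.0 - np.exp(-sigmas * step_size)
    # Density accumulated strictly before each sample (exclusive cumulative sum)
    accum = np.cumsum(sigmas * step_size, axis=1)
    accum = np.concatenate([np.zeros_like(accum[:, :1]), accum[:, :-1]], axis=1)
    weights = np.exp(-accum) * alpha
    return np.sum(weights * t_vals, axis=1)

# One ray with an opaque surface at the fourth sample (t = 5.0)
t_vals = np.linspace(2.0, 6.0, 5)[None, :]  # [[2, 3, 4, 5, 6]]
sigmas = np.zeros((1, 5))
sigmas[0, 3] = 1e3
depth = render_depth(sigmas, t_vals, step_size=1.0)
print(depth)  # [5.]
```

The resulting depth values can be normalized to [0, 1] and saved as a grayscale image for visualization.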
